ABSTRACT

We consider the following scheduling problem. A system is composed of n processors drawn from a 
pool of N. The processors can become faulty while in operation and faulty processors never recover. 
A report is issued whenever a fault occurs. This report states only the existence of a fault, but does 
not indicate its location. Based on this report, the scheduler can reconfigure the system and choose 
another set of n processors. The system operates satisfactorily as long as at most f of the n selected
processors are faulty. We exhibit a scheduling strategy allowing the system to operate satisfactorily 
until approximately (N/n)f faults are reported in the worst case. Our precise bound is tight. 

Key words: fault tolerance, maximum matching, redundancy, scheduling. 

1 Introduction 

Many control systems are subject to failures that can have dramatic effects. One simple way 
to deal with this problem is to build in some redundancy so that the whole system is able 
to function even if parts of it fail. In a general situation, the system's manager has access 
to some observations allowing it to control the system efficiently. Such observations bring 
information about the state of the system that might consist of partial fault reports. The 
available controls might include repairs and/or replacement of faulty processors. 

To model the problem, one needs to make assumptions regarding the occurrence of faults. 
Typically, they are assumed to occur according to some stochastic process. To make the 
model more tractable, one often considers the process to be memoryless, i.e. faults occur 
according to some exponential distribution. However, to be more realistic, many complications
and variations can be introduced in the stochastic model, and they complicate the time
analysis. Examples are: a processor might become faulty at any time or only during specific 
operations; the fault rate might vary according to the work load; faults might occur inde- 
pendently among the processors or may depend on proximity. The variations seem endless 
and the results are rarely general enough so as to carry some information or methodology 
from one model to another. 

One way to derive general results, independent of the specific assumptions about the 
time of occurrence of faults, is to adopt a logical time, that, instead of following an absolute 
frame, is incremented only at each occurrence of a fault. Within this framework, we measure 
the maximal number of faults to be observed until the occurrence of a crash instead of the 
maximal time of survival of a system until the occurrence of a crash. 

As an introduction to this general situation, we make the following assumptions and 
simplifications: 

Redundancy of the system: We assume the existence of a pool $\mathcal{N}$ composed of N
identical processors from among which, at every time t, a set $S_t$ of n processors is selected
to configure the system. The system works satisfactorily as long as at least n - f
processors among the n currently in operation are not faulty. It cannot tolerate more than
f faults at any given time: it stops functioning if f + 1 processors among these n processors
are faulty.

Occurrence of faults, reports and logical time: We consider the situation in which 
failures do not occur simultaneously and where, whenever a processor fails, a report is 
issued, stating that a failure has occurred, but without specifying the location of the 
failure. (Reporting additional information might be too expensive or time consuming.) 
Based on these reports, the scheduler might decide to reconfigure the system whenever 
such failure is reported. As a result, we restrict our attention to the discrete model, 
in which time t corresponds to the t-th failure in the system.

Repairs: No repair is being performed. 

Deterministic Algorithms: We assume that the scheduler does not use randomness. 

Since the universe consists of only N processors, and one processor fails at each time, no 
scheduling policy can guarantee that the system survives beyond time N. (A better a priori
upper bound is N - n + f + 1: at this time, only n - f - 1 processors are still non-faulty. This
does not allow for the required quorum of n - f non-faulty processors.) But some scheduling
policies seem to allow the system to survive longer than others. An obviously bad policy is
to choose n processors once and for all and never to change them: the system would then
collapse at time f + 1. This paper investigates the problem of determining the best survival
time. 

This best survival time is defined from a worst-case point of view: a given scheduler allows
the system to survive (up to a certain time) only if it allows it to survive against all possible 
failure patterns in which one processor fails at each time. 

Our informal description so far apparently constrains the faults to occur in on-line fashion:
for each t, the t-th fault occurs before the scheduler decides the set $S_{t+1}$ to be used
subsequently. However, since we have assumed that no reports about the locations of the
faults are available, there is no loss of generality in requiring the sets $S_t$ to be determined
a priori. (Of course, in practice, some more precise fault information may be available, and
each set $S_t$ would depend on the fault pattern up to time t.) Also, as we have assumed
a deterministic scheduler, we can assume that the decisions $S_1, \ldots, S_N$ are revealed before
the occurrence of any fault. We express this by saying that the faults occur in an off-line
fashion.

2 The Model 

Throughout this paper, we fix a universal set $\mathcal{N}$ of processors, and let N denote its
cardinality. We also fix a positive integer n (n < N) representing the number of processors that
are needed at each time period, and a positive integer f representing the number of failures
that can be tolerated (f < n).

We model the situation described in the introduction as a simple game between two 
entities, a scheduler and an adversary. The game consists of only one round, in which the 
scheduler plays first and the adversary second. The scheduler plays by selecting a sequence 
of N sets of processors (the schedule), each set of size n, and the adversary responds by 
choosing, from each set selected by the scheduler, a processor to kill. We consider only 
sequences of size N because the system must collapse by time N , since, at each time period, 
a new processor breaks down. 

Formally, a schedule $\mathcal{S}$ is defined to be a finite sequence, $S_1, \ldots, S_N$, of subsets of
$\mathcal{N}$, such that $|S_t| = n$ for all t, $1 \le t \le N$. An $\mathcal{S}$-adversary $\mathcal{A}$ is defined to be a
finite sequence, $s_1, \ldots, s_N$, of elements of $\mathcal{N}$ such that $s_t \in S_t$ for every t.

Now let $\mathcal{S}$ be a schedule, and $\mathcal{A}$ an $\mathcal{S}$-adversary. Define the survival time,
$T(\mathcal{S}, \mathcal{A})$, to be the largest value of t such that, for all $u \le t$,
$|\{s_1, \ldots, s_u\} \cap S_u| \le f$. That is, for all time periods u up to and including time
period t, there are no more than f processors in the set $S_u$ that have failed by time u.

We are interested in the minimum survival time for a particular schedule, with respect to
arbitrary adversaries. Thus, we define the minimum survival time for a schedule, $T(\mathcal{S})$, to
be $T(\mathcal{S}) = \min_{\mathcal{A}} T(\mathcal{S}, \mathcal{A})$. In this definition, the minimum is taken over all
$\mathcal{S}$-adversaries. An adversary $\mathcal{A}$ for which $T(\mathcal{S}) = T(\mathcal{S}, \mathcal{A})$ is said to be minimal
for $\mathcal{S}$. Finally, we are interested in determining the schedule that guarantees the greatest
minimum survival time. Thus, we define the optimum survival time $T_{opt}$ to be
$\max_{\mathcal{S}} T(\mathcal{S}) = \max_{\mathcal{S}} \min_{\mathcal{A}} T(\mathcal{S}, \mathcal{A})$. Also define a schedule $\mathcal{S}$ to be
optimum provided that $T(\mathcal{S}) = T_{opt}$. Our objectives in this paper are to compute $T_{opt}$ as a
function of N, n and f, to exhibit an optimum schedule, and to determine a minimal adversary
for each schedule.
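
As an illustration of these definitions, the following minimal sketch evaluates $T(\mathcal{S}, \mathcal{A})$ directly and obtains $T(\mathcal{S})$ by enumerating every adversary. The function names, the 0-based processor labels and the two sample schedules are assumptions made for this example; the exhaustive search is only meant for very small instances.

```python
from itertools import product
from typing import Sequence, Set


def survival_time(schedule: Sequence[Set[int]], adversary: Sequence[int], f: int) -> int:
    """T(S, A): largest t such that, for all u <= t, at most f processors
    of S_u have failed by time u."""
    failed: Set[int] = set()
    survived = 0
    for s_u, kill in zip(schedule, adversary):
        if kill not in s_u:
            raise ValueError("the adversary must kill a processor of the current set")
        failed.add(kill)
        if len(failed & s_u) > f:
            break  # more than f faulty processors in operation: the system crashes
        survived += 1
    return survived


def min_survival_time(schedule: Sequence[Set[int]], f: int) -> int:
    """T(S): minimum of T(S, A) over all S-adversaries (exhaustive, tiny inputs only)."""
    return min(
        survival_time(schedule, adversary, f)
        for adversary in product(*(sorted(s) for s in schedule))
    )


# Hypothetical instance: N = 4 processors labelled 0..3, n = 2, f = 1.
alternating = [{0, 1}, {2, 3}, {0, 1}, {0, 1}]
fixed = [{0, 1}] * 4  # the "never reconfigure" policy from the introduction
print(min_survival_time(alternating, f=1))
print(min_survival_time(fixed, f=1))
```

On this instance the fixed policy survives only f = 1 time period, in line with the remark in the introduction that it collapses at time f + 1.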

3 The Result



Recall that $1 \le f < n < N$ are three fixed integers. Our main result is stated in terms of
the following function defined on the set of positive real numbers (see Figure 1):

$$h_{n,f}(k) = \lfloor k/n \rfloor f + \left( f + k - \lfloor k/n \rfloor n - n \right)^+,$$

where $x^+ = \max(x, 0)$. In particular, $h_{n,f}(k) = kf/n$ when n divides k.




Figure 1. The function $h_{n,f}(k)$.

The main result of this paper is:

Theorem 3.1

$T_{opt} = h_{n,f}(N)$.
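
The bound is easy to evaluate numerically. The sketch below assumes the closed form $h_{n,f}(k) = \lfloor k/n \rfloor f + (f + k - \lfloor k/n \rfloor n - n)^+$, which is the length of the schedule built in the proof of Lemma 3.2; the function name `h` and its argument order are choices made for this example.

```python
import math


def h(n: int, f: int, k: float) -> float:
    """Assumed closed form of h_{n,f}(k): f periods per full batch of n
    processors, plus (f + remainder - n)^+ periods for the leftover ones."""
    full_batches = math.floor(k / n)
    remainder = k - full_batches * n
    return full_batches * f + max(f + remainder - n, 0)


print(h(4, 3, 7))  # the instance used in Figure 2: n = 4, f = 3, R = 7
print(h(2, 1, 4))  # the instance N = 4, n = 2, f = 1 from Section 5
```

For these two instances the function evaluates to 5 and 2, the values quoted in Sections 4 and 5.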

We will present our proof in two lemmas proving respectively that $T_{opt}$ is no smaller and
no bigger than $h_{n,f}(N)$.

Lemma 3.2 

$T_{opt} \ge h_{n,f}(N)$.

Proof: Consider the schedule $\mathcal{S}_{trivial}$ in which the N processors are partitioned into
$\lfloor N/n \rfloor$ batches of n processors each and one batch of $p = N - \lfloor N/n \rfloor n$. Each of the
first $\lfloor N/n \rfloor$ batches is used f time periods and then set aside. Then, the last batch of
processors along with any n - p of the processors set aside is used for $(f + p - n)^+$ time
periods. It is easy to see that no adversary can succeed in killing f + 1 processors within a
batch before this schedule expires. ■
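
The construction in this proof can be sketched as follows. The code builds the batches explicitly and checks, by enumerating all adversaries on small instances, that the schedule is never killed before it expires. Processor labels 0..N-1, the function names and the choice of the first n - p processors as the reused ones are assumptions of this example.

```python
from itertools import product


def trivial_schedule(N, n, f):
    """Schedule from the proof of Lemma 3.2: each full batch is used f times,
    then the p leftover processors are completed with n - p set-aside ones."""
    full, p = divmod(N, n)
    batches = [set(range(b * n, (b + 1) * n)) for b in range(full)]
    schedule = [batch for batch in batches for _ in range(f)]
    extra = max(f + p - n, 0)
    if extra > 0:
        last = set(range(full * n, N)) | set(range(n - p))
        schedule += [last] * extra
    return schedule


def min_survival_time(schedule, f):
    """Exhaustive minimum of the survival time over all adversaries."""
    best = len(schedule)
    for adversary in product(*(sorted(s) for s in schedule)):
        failed, survived = set(), 0
        for s_u, kill in zip(schedule, adversary):
            failed.add(kill)
            if len(failed & s_u) > f:
                break  # crash: more than f faulty processors in operation
            survived += 1
        best = min(best, survived)
    return best


for N, n, f in [(4, 2, 1), (5, 3, 2), (7, 4, 3)]:
    s = trivial_schedule(N, n, f)
    print((N, n, f), len(s), min_survival_time(s, f))
```

In each case the schedule length and the exhaustive minimum coincide, which is what the lemma asserts for these small instances.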

In order to prove the other direction of Theorem 3.1, we need the following result about 
the rate of increase of the function $h_{n,f}(k)$.

Lemma 3.3 For $0 \le k$ and $0 \le \ell \le n$ we have $h_{n,f}(k) \le h_{n,f}(k + \ell) + n - \ell - f$.

Proof: Notice first that $h_{n,f}(k) = h_{n,f}(k + n) - f$ for all $k \ge 0$. Moreover, the function
h increases at a sublinear rate (see Figure 1) so that, for $p, q \ge 0$, we have
$h_{n,f}(p + q) \le h_{n,f}(p) + q$. Letting $p = k + \ell$ and $q = n - \ell$, we obtain

$$h_{n,f}(k) = h_{n,f}(k + n) - f \le h_{n,f}(k + \ell) + n - \ell - f,$$

which proves the lemma. ■
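
A quick numerical sanity check of the two facts used in this proof, and of the lemma itself, is sketched below for small integer arguments. It assumes the closed form $h_{n,f}(k) = \lfloor k/n \rfloor f + (f + k - \lfloor k/n \rfloor n - n)^+$; the ranges scanned are arbitrary choices for this example, and a finite scan is of course no substitute for the proof.

```python
def h(n, f, k):
    """Assumed closed form of h_{n,f}(k), integer arguments only."""
    full, rem = divmod(k, n)
    return full * f + max(f + rem - n, 0)


violations = []
for n in range(1, 7):
    for f in range(1, n + 1):
        for k in range(0, 30):
            # Periodicity: h(k) = h(k + n) - f.
            if h(n, f, k) != h(n, f, k + n) - f:
                violations.append(("shift", n, f, k))
            for l in range(0, n + 1):
                # Sublinear growth: h(p + q) <= h(p) + q with p = k + l, q = n - l.
                if h(n, f, k + n) > h(n, f, k + l) + (n - l):
                    violations.append(("growth", n, f, k, l))
                # Lemma 3.3 itself.
                if h(n, f, k) > h(n, f, k + l) + n - l - f:
                    violations.append(("lemma", n, f, k, l))

print(violations)
```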



4 The Upper Bound 

In this section we establish the other direction of the main theorem. We begin with some 
general graph theoretical definitions. 

Definition 4.1 

• For every vertex v of a graph G, we let $\gamma_G(v)$ denote the set of vertices adjacent to v.
We can extend this notation to sets: for all sets C of vertices, $\gamma_G(C) = \bigcup_{v \in C} \gamma_G(v)$.

• For every bipartite graph G, $\nu(G)$ denotes the size of a maximum matching of G.

• For every pair of positive integers L, R, a left totally ordered bipartite graph of size
(L, R) is a bipartite graph with bipartition $\mathcal{L}, \mathcal{R}$, where $\mathcal{L}$ is a totally ordered set of
size L and $\mathcal{R}$ is a set of size R. We label $\mathcal{L} = \{a_1, \ldots, a_L\}$ so that $a_i < a_j$ for every
$1 \le i < j \le L$. For every $\mathcal{L}' \subseteq \mathcal{L}$ and $\mathcal{R}' \subseteq \mathcal{R}$, the subgraph induced by $\mathcal{L}'$ and
$\mathcal{R}'$ is a left totally ordered bipartite graph with the total order on $\mathcal{L}$ inducing the total
order on $\mathcal{L}'$.

• Let G be a left totally ordered bipartite graph of size (L, R). For t = 1, ..., L,
we let $I_t(G)$ denote the left totally ordered subgraph of G induced by the subsets
$\{a_1, a_2, \ldots, a_{t-1}\} \subseteq \mathcal{L}$ and $\gamma_G(a_t) \subseteq \mathcal{R}$.

Let us justify quickly the notion of left total order. In this definition, we have in mind
that $\mathcal{L}$ represents the labels attached to the different times, and that $\mathcal{R}$ represents the
labels attached to the available processors. The times are naturally ordered. The main
argument used in the proof is to reduce an existing schedule to a shorter one. In doing so,
we in particular select a subsequence of times. Although these times are not necessarily
consecutive, they are still naturally ordered. The total order on $\mathcal{L}$ is the precise notion
formalizing the ordering structure characterizing time.

Consider a finite schedule $\mathcal{S} = S_1, \ldots, S_T$. In graph theoretic terms, it can be represented
as a left totally ordered bipartite graph G with bipartition $\mathcal{T} = \{1, 2, \ldots, T\}$ and
$\mathcal{N} = \{1, 2, \ldots, N\}$. There is an edge between vertex $t \in \mathcal{T}$ and vertex $i \in \mathcal{N}$ if the
processor i is selected at time t. The fact that, for all t, $|S_t| = n$ translates into the fact
that vertex $t \in \mathcal{T}$ has degree n. For such a bipartite graph, the game of the adversary consists
in selecting one edge incident to each vertex $t \in \mathcal{T}$.

Observe that the adversary can kill the schedule at time t if it has already killed, before
time t, f of the n processors used at time t. It then kills another one at time t and the
system collapses. In terms of the graph G, there exists an adversary that kills the schedule
at time t if and only if the subgraph $I_t(G)$ has a matching of size f, i.e. $\nu(I_t(G)) \ge f$.
Therefore, the set $\mathcal{P}$ that we now define represents the set of pairs of integers (L, R) for
which there exists a schedule that survives at time L, when R processors are available.
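
The matching criterion above can be sketched in code: for each time t, build $I_t(G)$ from the earlier times and the processors of $S_t$, compute a maximum matching with a simple augmenting-path routine, and stop at the first t where the matching reaches size f. The function names and the sample schedules are assumptions of this example, and the sketch assumes f < n so that one more processor remains to be killed at time t.

```python
def max_matching(adj):
    """Maximum bipartite matching by augmenting paths.
    adj[u] lists the right vertices adjacent to left vertex u.
    Returns a dict mapping each matched right vertex to its left vertex."""
    match = {}

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False

    for u in range(len(adj)):
        try_augment(u, set())
    return match


def nu_I_t(schedule, t):
    """nu(I_t(G)) for 1-based t: times 1..t-1 against the processors of S_t."""
    current = schedule[t - 1]
    adj = [sorted(schedule[u] & current) for u in range(t - 1)]
    return len(max_matching(adj))


def min_survival_by_matching(schedule, f):
    """Worst-case survival time: one less than the first t with nu(I_t(G)) >= f."""
    for t in range(1, len(schedule) + 1):
        if nu_I_t(schedule, t) >= f:
            return t - 1
    return len(schedule)


# Hypothetical schedules with n = 2, f = 1 and n = 4, f = 3.
print(min_survival_by_matching([{0, 1}, {2, 3}, {0, 1}, {0, 1}], f=1))
print(min_survival_by_matching([{0, 1, 2, 3}] * 3 + [{0, 4, 5, 6}] * 2, f=3))
```

Only T maximum matchings are computed, one per time period, instead of enumerating all adversaries.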

Definition 4.2 Let L and R be two positive integers. $(L, R) \in \mathcal{P}$ iff there exists a left
totally ordered bipartite graph G of size (L, R) with bipartition $\mathcal{L}$ and $\mathcal{R}$ satisfying the two
following properties:

1. All vertices in $\mathcal{L}$ have degree exactly equal to n,

2. For every $t = 1, \ldots, |\mathcal{L}|$, all matchings in $I_t(G)$ have size at most equal to f - 1, i.e.
$\nu(I_t(G)) \le f - 1$.

The main tool used in the proof of Theorem 3.1 is the following duality result for the 
maximum bipartite matching problem, known as Ore's Deficiency Theorem [3]. A simple 
proof of this theorem and related results can be found in [2]. 

Theorem 4.1 Let G be a bipartite graph with bipartition A and B. Then the size $\nu(G)$ of
a maximum matching is given by the formula:

$$\nu(G) = \min_{C \subseteq B} \left[\, |B - C| + |\gamma_G(C)| \,\right]. \qquad (1)$$
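
Formula (1) can be checked by brute force on a small graph. The sketch below computes the maximum matching with augmenting paths and compares it with the minimum of $|B - C| + |\gamma_G(C)|$ over all subsets C of B. The example graph and the function names are assumptions made for illustration.

```python
from itertools import combinations


def matching_size(neighbors):
    """Size of a maximum matching; neighbors[b] is the set of A-vertices adjacent to b."""
    match = {}  # A-vertex -> B-vertex

    def augment(b, seen):
        for a in neighbors[b]:
            if a in seen:
                continue
            seen.add(a)
            if a not in match or augment(match[a], seen):
                match[a] = b
                return True
        return False

    # Each successful augmentation increases the matching by one edge.
    return sum(augment(b, set()) for b in neighbors)


def ore_minimum(neighbors):
    """Right-hand side of (1): min over C subset of B of |B - C| + |gamma(C)|."""
    B = list(neighbors)
    best = len(B)  # value for the empty set C
    for r in range(1, len(B) + 1):
        for C in combinations(B, r):
            gamma = set().union(*(neighbors[b] for b in C))
            best = min(best, len(B) - r + len(gamma))
    return best


# Hypothetical bipartite graph with A = {a1, a2, a3} and B = {b1, b2, b3, b4}.
graph = {"b1": {"a1"}, "b2": {"a1"}, "b3": {"a1", "a2"}, "b4": {"a3"}}
print(matching_size(graph), ore_minimum(graph))
```

Here the minimum is attained, for instance, by C = {b1, b2}, whose two vertices share the single neighbor a1.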

The following lemma is crucial for our proof. 

Lemma 4.2 There are no positive integers L and R such that $(L, R) \in \mathcal{P}$ and such that
$L > h_{n,f}(R)$.

Proof: 

Working by contradiction, consider two positive integers L and R such that $(L, R) \in \mathcal{P}$
and $L > h_{n,f}(R)$. We first show the existence of two integers L' and R' such that L' < L,
$(L', R') \in \mathcal{P}$ and $L' > h_{n,f}(R')$.

Let $\mathcal{L} = \{a_1, a_2, \ldots, a_L\}$ and $\mathcal{R} = \{b_1, b_2, \ldots, b_R\}$ be the bipartition of the graph G whose
existence is ensured by the hypothesis $(L, R) \in \mathcal{P}$.

We apply Theorem 4.1 to the graph $I_L(G)$ where we set $A = \{a_1, a_2, \ldots, a_{L-1}\}$ and
$B = \gamma_G(a_L)$. Let C denote a subset of B for which the minimum in (1) is attained. (C is
possibly empty.) Define $\mathcal{L}' = \mathcal{L} - (\{a_L\} \cup \gamma_{I_L(G)}(C))$ and $\mathcal{R}' = \mathcal{R} - C$ and let L' and R'
denote the cardinalities of $\mathcal{L}'$ and $\mathcal{R}'$. Hence, $L' = L - 1 - |\gamma_{I_L(G)}(C)|$ so that L' < L. Consider
the bipartite subgraph G' of G induced by the set of vertices $\mathcal{L}' \cup \mathcal{R}'$. In other words, in order
to construct G' from G, we remove the set $C \cup \{a_L\}$ of vertices and all vertices adjacent to
some vertex in C. We have illustrated this construction in Figure 2. In that specific example,
n = 4, f = 3, L = 6 and R = 7, while $h_{4,3}(7) = 5$. One can show that $C = \{b_5, b_6, b_7\}$ and
as a result G' is the graph induced by the vertices $\{a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4\}$. The graph G'
has size (L', R') = (4, 4).



Figure 2. An example of the construction of G' from G (panels G and G'). The vertices in C are darkened.

We first show that $(L', R') \in \mathcal{P}$. Since the vertices in $\mathcal{L}'$ correspond to the vertices
of $\mathcal{L} - \{a_L\}$ not connected to C, their degree in G' is also n. Furthermore, G', being a
subgraph of G, inherits property 2 of Definition 4.2. Indeed, assume that there is a vertex
$a_{t'}$ in G' such that $I_{t'}(G')$ has a matching of size f. Let t be the label of the corresponding
vertex in graph G. Since the total order on $\mathcal{L}'$ is induced by the total order on $\mathcal{L}$, $I_{t'}(G')$ is
a subgraph of $I_t(G)$. Therefore, $I_t(G)$ would also have a matching of size f, a contradiction.

Let us show that $L' > h_{n,f}(R')$. The assumption $(L, R) \in \mathcal{P}$ implies that $f - 1 \ge \nu(I_L(G))$.
Using Theorem 4.1 and the fact that $B = \gamma_G(a_L)$ has cardinality n, this can be rewritten as

$$f - 1 \ge \nu(I_L(G)) = |B - C| + |\gamma_{I_L(G)}(C)| = n - |C| + |\gamma_{I_L(G)}(C)|. \qquad (2)$$

Since $C \subseteq B \subseteq \mathcal{R}$, we have that $0 \le |C| \le n \le R$ and, thus, the hypotheses of Lemma 3.3
are satisfied for $k = R - |C|$ and $\ell = |C|$. Therefore, we derive from the lemma that

$$h_{n,f}(R') = h_{n,f}(R - |C|) \le h_{n,f}(R) + n - |C| - f.$$

Using (2), this implies that

$$h_{n,f}(R') \le h_{n,f}(R) - |\gamma_{I_L(G)}(C)| - 1.$$

By assumption, L is strictly greater than $h_{n,f}(R)$, implying

$$h_{n,f}(R') < L - 1 - |\gamma_{I_L(G)}(C)|.$$

But the right-hand side of this inequality is precisely L', implying that $L' > h_{n,f}(R')$.
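
For convenience, the steps of this argument can be collected into a single chain. The abbreviation $c = |\gamma_{I_L(G)}(C)|$ is introduced here only to shorten the display; it is not notation from the paper.

```latex
\begin{align*}
h_{n,f}(R') &= h_{n,f}(R - |C|) \\
  &\le h_{n,f}(R) + n - |C| - f
     && \text{(Lemma 3.3 with } k = R - |C|,\ \ell = |C|\text{)} \\
  &\le h_{n,f}(R) + (f - 1 - c) - f
     && \text{(by (2): } n - |C| \le f - 1 - c\text{)} \\
  &= h_{n,f}(R) - c - 1 \\
  &< L - 1 - c = L'
     && \text{(since } L > h_{n,f}(R)\text{)}.
\end{align*}
```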

We have therefore established that for all integers L and R such that $(L, R) \in \mathcal{P}$ and
$L > h_{n,f}(R)$, there exist two integers L' and R' such that L' < L, $(L', R') \in \mathcal{P}$ and
$L' > h_{n,f}(R')$. Among all such pairs (L, R), we select the pair for which L is minimum.
By the result that we just established, we obtain a pair (L', R') such that $(L', R') \in \mathcal{P}$,
$L' > h_{n,f}(R')$ and L' < L. This contradicts the minimality of L.

■ 

Lemma 4.3 

$T_{opt} \le h_{n,f}(N)$.

Proof: By assumption, $(T_{opt}, N) \in \mathcal{P}$. Hence this result is a direct consequence of
Lemma 4.2.

■ 

This Lemma along with Lemma 3.2 proves Theorem 3.1. 

In the process of proving Lemma 3.2 we proved that $\mathcal{S}_{trivial}$ is an optimum schedule. On
the other hand, the interpretation of the problem as a graph problem also demonstrates that
the adversary has a polynomial time algorithm for finding an optimum killing sequence for
each schedule $\mathcal{S}$. When provided with $\mathcal{S}$, the adversary needs only to compute a polynomial
number (actually fewer than N) of maximum bipartite matchings, for which well known 
polynomial algorithms exist (for the fastest known, see [1]). 
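
One way such a killing sequence could be assembled from the matchings is sketched below. It finds the first time t at which $I_t(G)$ has a matching of size f, kills the matched processors at their matched times, and kills one further processor of $S_t$ at time t. This is an illustrative sketch assuming f < n, not the algorithm of [1]; it uses a simple augmenting-path matching, and the function names and sample schedule are invented for the example.

```python
def max_matching(adj):
    """adj[u] lists processors adjacent to time index u; returns processor -> time index."""
    match = {}

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False

    for u in range(len(adj)):
        try_augment(u, set())
    return match


def survival_time(schedule, adversary, f):
    """Largest t such that at most f processors of S_u have failed for all u <= t."""
    failed, survived = set(), 0
    for s_u, kill in zip(schedule, adversary):
        failed.add(kill)
        if len(failed & s_u) > f:
            break
        survived += 1
    return survived


def killing_sequence(schedule, f):
    """Return (adversary, t_star), where t_star is the 1-based crash time,
    or None if the schedule cannot be killed. Assumes f < n."""
    kills = [min(s) for s in schedule]  # arbitrary default choice at every time
    for t in range(1, len(schedule) + 1):
        current = schedule[t - 1]
        adj = [sorted(schedule[u] & current) for u in range(t - 1)]
        match = max_matching(adj)
        if len(match) >= f:
            pairs = list(match.items())[:f]  # f matched (processor, time) pairs suffice
            for proc, u in pairs:
                kills[u] = proc
            spare = sorted(current - {proc for proc, _ in pairs})
            kills[t - 1] = spare[0]  # one more processor of S_t dies at time t
            return kills, t
    return kills, None


# Hypothetical schedule with N = 4, n = 2, f = 1.
schedule = [{0, 2}, {1, 3}, {0, 1}, {2, 3}]
adversary, t_star = killing_sequence(schedule, f=1)
print(adversary, t_star, survival_time(schedule, adversary, f=1))
```

The returned sequence crashes the system exactly at time t_star, so its survival time is t_star - 1.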

5 Future Research 

The problem solved in this paper is a first step towards modeling complex resilient systems 
and there are many interesting extensions. We mention only a few. 

An interesting extension is to consider the case of a system built up of processors of 
different types. For instance consider the case of a system built up of a total of n processors, 
that is reconfigured at each time period and that needs at least $g_1$ non-faulty processors
of type 1 and at least $g_2$ non-faulty processors of type 2 in order to function satisfactorily.
Assume also that these processors are drawn from a pool $\mathcal{N}_1$ of $N_1$ processors of type 1 and
a pool $\mathcal{N}_2$ of $N_2$ processors of type 2, that $\mathcal{N}_1 \cap \mathcal{N}_2 = \emptyset$, and that there are no repairs. It is
easy to see that the optimum survival time $T_{opt}$ is at least the survival time of every strategy
for which the number of processors of type 1 and type 2 is kept constant throughout. Hence:

$$T_{opt} \ge \max_{\{(n_1, n_2);\ n_1 + n_2 = n\}} \min\left( h_{n_1, n_1 - g_1}(N_1),\ h_{n_2, n_2 - g_2}(N_2) \right).$$

It would be an interesting question whether $T_{opt}$ is exactly equal to this value or very close
to it.
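
The lower bound for two processor types can be evaluated by scanning the possible splits $(n_1, n_2)$. The sketch below assumes the closed form of $h_{n,f}$ achieved in the proof of Lemma 3.2, takes $g_1, g_2 \ge 1$, and skips splits that exceed a pool size; the function names and the sample parameters are hypothetical.

```python
def h(n, f, k):
    """Assumed closed form of h_{n,f}(k), integer arguments only."""
    full, rem = divmod(k, n)
    return full * f + max(f + rem - n, 0)


def two_type_lower_bound(n, g1, g2, N1, N2):
    """max over n1 + n2 = n of min(h_{n1, n1-g1}(N1), h_{n2, n2-g2}(N2)).
    Each type i tolerates n_i - g_i faults, so n_i >= g_i is required."""
    best = 0
    for n1 in range(max(g1, 1), n - max(g2, 1) + 1):
        n2 = n - n1
        if n1 > N1 or n2 > N2:
            continue  # not enough processors of one type in its pool
        best = max(best, min(h(n1, n1 - g1, N1), h(n2, n2 - g2, N2)))
    return best


# Hypothetical parameters: n = 4, g1 = g2 = 1, N1 = N2 = 6.
print(two_type_lower_bound(4, 1, 1, 6, 6))
```

For these sample values the balanced split (2, 2) gives the best bound, since any split leaving a type with exactly $g_i$ processors tolerates no fault of that type.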

Extend the definition of a scheduler to represent a randomized scheduling protocol. 
(Phrased in this context, the result presented in this paper is only about deterministic 
scheduling protocols.) A scheduler is called adversary-oblivious if it decides the schedule 
independently of the choices $s_1, s_2, \ldots$ made by the adversary. An off-line adversary is an
adversary that has access to the knowledge of the full schedule $S_1, S_2, \ldots$ before deciding
the full sequence $s_1, s_2, \ldots$. Note that, by definition, off-line adversaries make sense only
with adversary-oblivious schedulers. By comparison, an on-line adversary decides for each
time t which processor $s_t$ to kill, without knowing the future schedule: at each time t the
adversary decides $s_t$ based on the sole knowledge of $S_1, \ldots, S_t$ and of $s_1, \ldots, s_{t-1}$. In this
more general framework, the quantity we want to determine is

$$T_{opt} = \max_{\mathcal{S}} \min_{\mathcal{A}} E\left[ T(\mathcal{S}, \mathcal{A}) \right]. \qquad (3)$$

For an adversary-oblivious, randomized scheduler, one can consider two cases based on 
whether the adversary is on-line or off-line. As is easily seen, if the adversary is off-line, 
randomness does not help in the design of optimal schedulers: introducing randomness in 
the schedules cannot increase the survival time if the adversary gets full knowledge of the 
schedule before committing to any of its choices. As a result, the off-line version corresponds 
to the situation investigated in this paper. 

It would be of interest to study the online version of Problem (3). On-line adversaries 
model somewhat more accurately practical situations: faults naturally occur in an on-line 
fashion and the role of the program designer is then to design a scheduler whose expected 
performance is optimum. Hence, comparing the two versions of Problem (3) would allow us to
understand how much randomness can help in the design of optimum, adversary-oblivious
schedulers.

For instance, in the case where N = 4, n = 2 and f = 1, and where on-line adversaries
are considered, a case analysis shows that $T_{opt}$ is equal to 9/4 for randomized algorithms.
A direct application of Theorem 3.1 shows that $T_{opt} = 2$ for deterministic algorithms.
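
The deterministic value for this small instance can also be confirmed by exhaustive search over all schedules and all adversaries, as sketched below. The search is exponential and only practical for tiny N; the function names are assumptions of this example, and the randomized value 9/4 is not reproduced here.

```python
from itertools import combinations, product


def min_survival_time(schedule, f):
    """T(S): exhaustive minimum of the survival time over all adversaries."""
    best = len(schedule)
    for adversary in product(*(sorted(s) for s in schedule)):
        failed, survived = set(), 0
        for s_u, kill in zip(schedule, adversary):
            failed.add(kill)
            if len(failed & s_u) > f:
                break  # crash: more than f faulty processors in operation
            survived += 1
        best = min(best, survived)
    return best


def optimum_survival_time(N, n, f):
    """T_opt: maximum of T(S) over all schedules of N sets of size n."""
    sets = [frozenset(c) for c in combinations(range(N), n)]
    return max(min_survival_time(s, f) for s in product(sets, repeat=N))


print(optimum_survival_time(4, 2, 1))
```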

Going towards even more realistic and complex situations, we can also study the case 
where the scheduler is provided at each time with some partial information about the fault 
process. 